Scaffold with Stochastic Gradients: New Analysis with Linear Speed-Up

arXiv.org Machine Learning

This paper proposes a novel analysis for the Scaffold algorithm, a popular method for dealing with data heterogeneity in federated learning. While its convergence in deterministic settings, where local control variates mitigate client drift, is well established, the impact of stochastic gradient updates on its performance is less understood. To address this problem, we first show that its global parameters and control variates define a Markov chain that converges to a stationary distribution in the Wasserstein distance. Leveraging this result, we prove that Scaffold achieves linear speed-up in the number of clients up to higher-order terms in the step size. Nevertheless, our analysis reveals that Scaffold retains a higher-order bias, similar to FedAvg, that does not decrease as the number of clients increases. This highlights opportunities for developing improved stochastic federated learning algorithms.
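
The role of the control variates can be illustrated with a minimal sketch. It is not the paper's setup: it assumes scalar quadratic client objectives f_i(x) = 0.5 * a_i * (x - b_i)^2, full client participation, a server step size of 1, and the commonly used control-variate update c_i <- c_i - c + (x - y_i) / (K * lr). The function name `run_federated` and all of its parameters are hypothetical.

```python
import random


def run_federated(a, b, rounds=300, local_steps=5, lr=0.02,
                  noise_std=0.0, use_control_variates=True, seed=0):
    """Toy Scaffold on scalar quadratics f_i(x) = 0.5 * a[i] * (x - b[i])**2.

    With use_control_variates=False the loop reduces to FedAvg-style
    local SGD followed by plain averaging.
    """
    rng = random.Random(seed)
    n = len(a)
    x = 0.0               # global parameter
    c = 0.0               # global control variate
    c_local = [0.0] * n   # one control variate per client

    for _round in range(rounds):
        delta_x, delta_c = 0.0, 0.0
        for i in range(n):
            y = x
            # Drift correction: global minus local control variate.
            correction = (c - c_local[i]) if use_control_variates else 0.0
            for _step in range(local_steps):
                # Stochastic gradient: exact gradient plus Gaussian noise.
                grad = a[i] * (y - b[i]) + rng.gauss(0.0, noise_std)
                y -= lr * (grad + correction)
            delta_x += (y - x) / n
            if use_control_variates:
                # Assumed update rule: estimate the local gradient from
                # the total displacement over the local steps.
                c_new = c_local[i] - c + (x - y) / (local_steps * lr)
                delta_c += (c_new - c_local[i]) / n
                c_local[i] = c_new
        # Server aggregation with full participation.
        x += delta_x
        c += delta_c
    return x


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0]    # heterogeneous curvatures (illustrative values)
    b = [-1.0, 0.0, 2.0]   # heterogeneous client minimizers
    x_star = sum(ai * bi for ai, bi in zip(a, b)) / sum(a)
    print("global minimizer:", x_star)
    print("with control variates:", run_federated(a, b))
    print("without control variates:",
          run_federated(a, b, use_control_variates=False))
    print("noisy gradients:", run_federated(a, b, noise_std=0.5))
```

In this toy problem with `noise_std=0.0`, the corrected iterates reach the global minimizer, while the uncorrected ones settle at a point shifted by client drift. Setting `noise_std` above zero makes the iterates fluctuate around a stationary regime instead of converging to a point, which is the setting the abstract's Markov chain analysis addresses.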


InstructMol: Multi-Modal Integration for Building a Versatile and Reliable Molecular Assistant in Drug Discovery

arXiv.org Artificial Intelligence

The rapid evolution of artificial intelligence in drug discovery encounters challenges with generalization and extensive training, yet Large Language Models (LLMs) offer promise in reshaping interactions with complex molecular data. Our novel contribution, InstructMol, a multi-modal LLM, effectively aligns molecular structures with natural language via an instruction-tuning approach, utilizing a two-stage training strategy that adeptly combines limited domain-specific data with molecular and textual information. InstructMol showcases substantial performance improvements in drug discovery-related molecular tasks, surpassing leading LLMs and significantly reducing the gap with specialized models, thereby establishing a robust foundation for a versatile and dependable drug discovery assistant.


$\mathsf{G^2Retro}$ as a Two-Step Graph Generative Models for Retrosynthesis Prediction

arXiv.org Artificial Intelligence

Retrosynthesis is a procedure where a target molecule is transformed into potential reactants and thus the synthesis routes can be identified. Recently, computational approaches have been developed to accelerate the design of synthesis routes. In this paper, we develop a generative framework $\mathsf{G^2Retro}$ for one-step retrosynthesis prediction. $\mathsf{G^2Retro}$ imitates the reversed logic of synthetic reactions. It first predicts the reaction centers in the target molecules (products), identifies the synthons needed to assemble the products, and transforms these synthons into reactants. $\mathsf{G^2Retro}$ defines a comprehensive set of reaction center types, and learns from the molecular graphs of the products to predict potential reaction centers. To complete synthons into reactants, $\mathsf{G^2Retro}$ considers all the involved synthon structures and the product structures to identify the optimal completion paths, and accordingly attaches small substructures sequentially to the synthons. Here we show that $\mathsf{G^2Retro}$ is able to better predict the reactants for given products in the benchmark dataset than the state-of-the-art methods.


Machine learning frontier orbital energies of nanodiamonds

arXiv.org Artificial Intelligence

Nanodiamonds have a wide range of applications including catalysis, sensing, tribology and biomedicine. To leverage nanodiamond design via machine learning, we introduce the new dataset ND5k, consisting of 5,089 diamondoid and nanodiamond structures and their frontier orbital energies. ND5k structures are optimized via tight-binding density functional theory (DFTB) and their frontier orbital energies are computed using density functional theory (DFT) with the PBE0 hybrid functional. We also compare recent machine learning models for predicting frontier orbital energies of structures similar to those they were trained on (interpolation on ND5k), and we test their ability to extrapolate predictions to larger structures. For both the interpolation and extrapolation tasks, we find the best performance with the equivariant graph neural network PaiNN. The second-best results are achieved with a message passing neural network using a tailored set of atomic descriptors proposed here.


Scalable Fragment-Based 3D Molecular Design with Reinforcement Learning

arXiv.org Artificial Intelligence

Machine learning has the potential to automate molecular design and drastically accelerate the discovery of new functional compounds. Towards this goal, generative models and reinforcement learning (RL) using string and graph representations have been successfully used to search for novel molecules. However, these approaches are limited since their representations ignore the three-dimensional (3D) structure of molecules. In fact, geometry plays an important role in many applications in inverse molecular design, especially in drug discovery. Thus, it is important to build models that can generate molecular structures in 3D space based on property-oriented geometric constraints. To address this, one approach is to generate molecules as 3D point clouds by sequentially placing atoms at locations in space -- this allows the process to be guided by physical quantities such as energy or other properties. However, this approach is inefficient as placing individual atoms makes the exploration unnecessarily deep, limiting the complexity of molecules that can be generated. Moreover, when optimizing a molecule, organic and medicinal chemists use known fragments and functional groups, not single atoms. We introduce a novel RL framework for scalable 3D design that uses a hierarchical agent to build molecules by placing molecular substructures sequentially in 3D space, thus attempting to build on the existing human knowledge in the field of molecular design. In a variety of experiments with different substructures, we show that our agent, guided only by energy considerations, can efficiently learn to produce molecules with over 100 atoms from many distributions including drug-like molecules, organic LED molecules, and biomolecules.


Keeping it Simple: Language Models can learn Complex Molecular Distributions

arXiv.org Artificial Intelligence

Deep generative models of molecules have grown immensely in popularity; trained on relevant datasets, these models are used to search through chemical space. The downstream utility of generative models for the inverse design of novel functional compounds depends on their ability to learn a training distribution of molecules. The simplest example is a language model that takes the form of a recurrent neural network and generates molecules using a string representation. More sophisticated are graph generative models, which sequentially construct molecular graphs and typically achieve state-of-the-art results. However, recent work has shown that language models are more capable than once thought, particularly in the low-data regime. In this work, we investigate the capacity of simple language models to learn distributions of molecules. For this purpose, we introduce several challenging generative modeling tasks by compiling especially complex distributions of molecules. On each task, we evaluate the ability of language models as compared with two widely used graph generative models. The results demonstrate that language models are powerful generative models, capable of adeptly learning complex molecular distributions, and that they yield better performance than the graph models. Language models can accurately generate distributions of the highest-scoring penalized LogP molecules in ZINC15, multi-modal molecular distributions, and the largest molecules in PubChem.


Deep Learning for UV Absorption Spectra with SchNarc: First Steps Towards Transferability in Chemical Compound Space

arXiv.org Machine Learning

Machine learning (ML) has been shown to advance the research field of quantum chemistry in almost any possible direction and has recently also entered the excited states to investigate the multifaceted photochemistry of molecules. In this paper, we pursue two goals: i) We show how ML can be used to model permanent dipole moments for excited states and transition dipole moments by adapting the charge model of [Chem. Sci., 2017, 8, 6924-6935], which was originally proposed for the permanent dipole moment vector of the electronic ground state. ii) We investigate the transferability of our excited-state ML models in chemical space, i.e., whether an ML model can predict properties of molecules that it has never been trained on and whether it can learn the different excited states of two molecules simultaneously. To this end, we employ and extend our previously reported SchNarc approach for excited-state ML. We calculate UV absorption spectra from excited-state energies and transition dipole moments as well as electrostatic potentials from latent charges inferred by the ML model of the permanent dipole moment vectors. We train our ML models on CH$_2$NH$_2^+$ and C$_2$H$_4$, while predictions are carried out for these molecules and additionally for CHNH$_2$, CH$_2$NH, and C$_2$H$_5^+$. The results indicate that transferability is possible for the excited states.
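
How a spectrum follows from excited-state energies and transition dipole moments can be sketched as below. This is not SchNarc code: it assumes the textbook oscillator strength f = (2/3) * dE * |mu|^2 in atomic units, Gaussian line broadening with an arbitrary width, and a dipole built from point charges as mu = sum_i q_i * r_i. All function names and numerical values are hypothetical.

```python
import math


def dipole_from_charges(charges, positions):
    """Dipole vector mu = sum_i q_i * r_i from (latent) atomic charges."""
    return tuple(
        sum(q * r[k] for q, r in zip(charges, positions)) for k in range(3)
    )


def oscillator_strength(excitation_energy, transition_dipole):
    """f = (2/3) * dE * |mu|^2, with all quantities in atomic units."""
    mu_squared = sum(m * m for m in transition_dipole)
    return 2.0 / 3.0 * excitation_energy * mu_squared


def absorption_spectrum(energies, transition_dipoles, grid, width=0.01):
    """Sum of Gaussians centred at each excitation energy.

    Each line is weighted by its oscillator strength; the Gaussian
    width is an assumed broadening parameter, not a value from the paper.
    """
    spectrum = []
    for e in grid:
        total = 0.0
        for d_e, mu in zip(energies, transition_dipoles):
            weight = oscillator_strength(d_e, mu)
            total += weight * math.exp(-((e - d_e) ** 2) / (2.0 * width ** 2))
        spectrum.append(total)
    return spectrum


if __name__ == "__main__":
    # Two made-up excited states (energies in Hartree, dipoles in e*a0).
    energies = [0.30, 0.35]
    dipoles = [(1.0, 0.0, 0.0), (0.0, 0.2, 0.0)]
    grid = [0.25 + 0.005 * i for i in range(31)]
    for e, s in zip(grid, absorption_spectrum(energies, dipoles, grid)):
        print(f"{e:.3f}  {s:.4f}")
```

In the approach described above, the energies and dipoles would come from the trained ML model rather than being typed in by hand.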